Images and text appear in documents of almost every format, and how to insert them into files is a common topic. Inserting alone no longer covers every need, though: sometimes you have to extract the images and text from an existing file.
Suppose you want only the images in a PDF document and cannot copy them directly from the document. In that case, you need to extract the images from the PDF file. Personally, I think Spire.PDF is a good choice for extracting images or text from a PDF document, because the method is simple and gets the job done in a few minutes. Follow the procedure below.
Freely Download Spire.PDF
Procedure
Step1. Create a new project.
1.Create a new project in Visual Studio.
2.Set the Target Framework to .NET Framework 4.
Step2. Add reference.
1.Add Spire.Pdf.dll and System.Drawing as references in the project.
2.Add the following using directives at the top of the source file.
C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Drawing;
using System.Drawing.Imaging;
using Spire.Pdf;
VB.NET:
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text
Imports System.IO
Imports System.Drawing
Imports System.Drawing.Imaging
Imports Spire.Pdf
Step3. Extract images and text from the PDF document.
1.Create a PDF document and load a file from disk.
C# Code:
//Create a pdf document.
PdfDocument doc = new PdfDocument();
doc.LoadFromFile(@"C:\Program Files\e-iceblue\Spire.Pdf\Demos\Data\Sample2.pdf");
VB.NET Code:
'Create a pdf document.
Dim doc As New PdfDocument()
doc.LoadFromFile("C:\Program Files\e-iceblue\Spire.Pdf\Demos\Data\Sample2.pdf")
2.Extract images and text from the PDF file.
C# Code:
StringBuilder buffer = new StringBuilder();
IList<Image> images = new List<Image>();
foreach (PdfPageBase page in doc.Pages)
{
buffer.Append(page.ExtractText());
foreach (Image image in page.ExtractImages())
{
images.Add(image);
}
}
doc.Close();
VB.NET Code:
Dim buffer As New StringBuilder()
Dim images As IList(Of Image) = New List(Of Image)()
For Each page As PdfPageBase In doc.Pages
buffer.Append(page.ExtractText())
For Each image As Image In page.ExtractImages()
images.Add(image)
Next image
Next page
doc.Close()
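The loop above appends the text of every page to a single buffer and relies on doc.Close() being reached. As a minimal sketch of a variant, assuming the same Spire.PDF calls this article already uses (LoadFromFile, Pages, ExtractText, ExtractImages, Close), the helper below keeps the text of each page separately and closes the document in a finally block, so it is released even if extraction throws. PdfExtraction and ExtractPerPage are hypothetical names chosen for illustration, not part of Spire.PDF, and newer Spire.PDF releases may expose extraction through different classes.

```csharp
using System.Collections.Generic;
using System.Drawing;
using Spire.Pdf;

// Hypothetical helper for illustration; not part of Spire.PDF.
static class PdfExtraction
{
    // Returns the text of each page in page order and appends
    // every extracted image to the caller-supplied list.
    public static IList<string> ExtractPerPage(string pdfPath, IList<Image> images)
    {
        IList<string> pageTexts = new List<string>();
        PdfDocument doc = new PdfDocument();
        try
        {
            doc.LoadFromFile(pdfPath);
            foreach (PdfPageBase page in doc.Pages)
            {
                // One entry per page instead of one shared buffer.
                pageTexts.Add(page.ExtractText());
                foreach (Image image in page.ExtractImages())
                {
                    images.Add(image);
                }
            }
        }
        finally
        {
            // Runs whether or not extraction succeeded.
            doc.Close();
        }
        return pageTexts;
    }
}
```

Keeping one string per page makes it possible to write a separate text file for each page later, or to join them with String.Join when a single file is enough.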
3.Save the images and text extracted from the PDF file.
C# Code:
//save text
String fileName = "TextInPdf.txt";
File.WriteAllText(fileName, buffer.ToString());
//save image
int index = 0;
foreach (Image image in images)
{
String imageFileName = String.Format("Image-{0}.png", index++);
image.Save(imageFileName, ImageFormat.Png);
}
VB.NET Code:
'save text
Dim fileName As String = "TextInPdf.txt"
File.WriteAllText(fileName, buffer.ToString())
'save image
Dim index As Integer = 0
For Each image As Image In images
Dim imageFileName As String = String.Format("Image-{0}.png", index)
index += 1
image.Save(imageFileName, ImageFormat.Png)
Next image
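The save step above writes everything into the current working directory. A minimal sketch, assuming you would rather collect the output in one folder: the helper below creates the folder if it does not exist, reuses the Image-{0}.png naming pattern, and disposes each image once it has been written. ExtractedOutput, BuildImagePath, and Save are hypothetical names for illustration; the code uses only System.IO and System.Drawing calls.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

// Hypothetical helper for illustration.
static class ExtractedOutput
{
    // Builds the path of the image with the given index inside outputDir.
    public static string BuildImagePath(string outputDir, int index)
    {
        return Path.Combine(outputDir, String.Format("Image-{0}.png", index));
    }

    // Writes the text and all images into outputDir and returns the text file path.
    public static string Save(string outputDir, string text, IList<Image> images)
    {
        // Does nothing if the folder already exists.
        Directory.CreateDirectory(outputDir);

        string textPath = Path.Combine(outputDir, "TextInPdf.txt");
        File.WriteAllText(textPath, text);

        for (int index = 0; index < images.Count; index++)
        {
            // Dispose each image after saving to release its GDI+ resources.
            using (Image image = images[index])
            {
                image.Save(BuildImagePath(outputDir, index), ImageFormat.Png);
            }
        }
        return textPath;
    }
}
```

With the variables from the earlier steps, a call could look like `string fileName = ExtractedOutput.Save("Output", buffer.ToString(), images);`, where "Output" is an assumed folder name.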
Step4. Add the code that launches the text file, then press F5 to run the project.
C# Code:
//Launching the Text file.
System.Diagnostics.Process.Start(fileName);
VB.NET Code:
'Launching the Text file.
Process.Start(fileName)
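Process.Start(fileName) opens the text file in its associated application because .NET Framework, which this walkthrough targets, hands the file to the operating system shell by default. If the same code is compiled for .NET Core or .NET 5 and later, UseShellExecute defaults to false and has to be set explicitly. A minimal sketch, with Launcher, BuildStartInfo, and Open as hypothetical names:

```csharp
using System.Diagnostics;

// Hypothetical helper for illustration.
static class Launcher
{
    // Builds start info that asks the OS shell to open the file
    // with whatever application is registered for its extension.
    public static ProcessStartInfo BuildStartInfo(string fileName)
    {
        ProcessStartInfo info = new ProcessStartInfo(fileName);
        info.UseShellExecute = true;
        return info;
    }

    // Opens the file in its associated application.
    public static void Open(string fileName)
    {
        Process.Start(BuildStartInfo(fileName));
    }
}
```

Setting the flag explicitly is harmless on .NET Framework 4, so the same call works on either runtime.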